1.  Introduction


The two-parameter Weibull distribution has found many applications
in engineering and in the biological sciences. For instance, it has
been used by Cook, Doll and Fellingham (1969) and by Doll (1971) to
describe the observed age distribution of many human cancers. Its use for
describing failures of electrical and mechanical components is well
documented in the reliability literature.

In this note we address a fundamental problem involving any
application of the Weibull distribution. We wish to test the null
hypothesis that a given random sample comes from a Weibull distribution
with unknown parameters. Of the several methods for testing "goodness of
fit," those based on the sample distribution function are the most
popular. We present tables of critical values for testing the null
hypothesis in question, and also give some results comparing the power of
our tests with that of a test due to Mann, Scheuer and Fertig (1973).

A foundation for developing our tables of critical values is the
theory recently given by Durbin (1973), and by Serfling and Wood (1976),
on the weak convergence of an "empirical" stochastic process. This
stochastic process is based on the sample distribution function and
estimates of the unknown parameters. The statistics that we discuss
can be represented as well-behaved functionals of this empirical process.
Thus, the asymptotic distributions of the relevant test statistics can be
obtained as the distributions of the corresponding functionals of the
limiting process. These ideas are made precise in the sections that follow.


2.  Preliminaries 


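The two-parameter Weibull distribution is given by

$$F(t) = \begin{cases} 1 - \exp\left[-\left(\dfrac{t}{\theta}\right)^{\beta}\right], & t \ge 0, \\ 0, & \text{otherwise;} \end{cases} \tag{2.1}$$

the scale parameter $\theta$ and the shape parameter $\beta$ are both
assumed to be positive.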


If we make the transformation $X = -\ln T$, where T has a
two-parameter Weibull distribution, then the distribution of X is called
the extreme value distribution. It is given by

$$F(x) = \exp\left[-\exp\left(-\frac{x-a}{b}\right)\right], \qquad -\infty < x < \infty, \tag{2.2}$$

where $a = -\ln\theta$ and $b = 1/\beta$. We note that a and b are,
respectively, the location and the scale parameters of the extreme value
distribution.
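The transformation above can be checked numerically. The following is a minimal sketch, assuming hypothetical parameter values theta = 2.0 and beta = 1.5 and illustrative function names that do not appear in the original report. It verifies that P(X <= x) under (2.2) equals P(T >= t) = 1 - F(t) under (2.1) when x = -ln t.

```python
import math
import random

def weibull_cdf(t, theta, beta):
    """Two-parameter Weibull CDF of Equation (2.1)."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-((t / theta) ** beta))

def extreme_value_cdf(x, a, b):
    """Extreme value CDF of Equation (2.2)."""
    return math.exp(-math.exp(-(x - a) / b))

def to_extreme_value(weibull_sample):
    """Map Weibull observations T to X = -ln T."""
    return [-math.log(t) for t in weibull_sample]

def extreme_value_params(theta, beta):
    """Location a = -ln(theta) and scale b = 1/beta implied by (theta, beta)."""
    return -math.log(theta), 1.0 / beta

# Illustrative check with hypothetical parameter values.
theta, beta = 2.0, 1.5
a, b = extreme_value_params(theta, beta)
rng = random.Random(0)
# random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
sample = [rng.weibullvariate(theta, beta) for _ in range(5)]
for t, x in zip(sample, to_extreme_value(sample)):
    # P(X <= x) = P(T >= t) = 1 - F(t), because x = -ln t reverses the order.
    assert abs(extreme_value_cdf(x, a, b) - (1.0 - weibull_cdf(t, theta, beta))) < 1e-12
```

Because the map t -> -ln t is decreasing, small Weibull observations become large extreme value observations, which is why the survival function of T, rather than its distribution function, appears in the check.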

The tests that we discuss in this paper are based on the extreme
value distribution. To make a test of fit to the Weibull distribution we
first take the negative of the natural logarithms of the supposed
Weibull data. Thus, we wish to consider the case of testing whether the
distribution of a random sample $X_1, X_2, \ldots, X_n$ is an extreme value
distribution with unknown location parameter a and unknown scale
parameter b. Specifically, we wish to test the "null hypothesis"

$$H_0: F(x) = \exp\left[-\exp\left(-\frac{x-a}{b}\right)\right]$$

for all x and for some (a,b).


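When a and b are specified, then $H_0$ is "simple," and our
test reduces to testing the hypothesis that the independent random
variables

$$G\left(\frac{X_i - a}{b}\right) = \exp\left[-\exp\left(-\frac{X_i - a}{b}\right)\right], \qquad 1 \le i \le n,$$

have a common uniform (0,1) distribution. The Kolmogorov-Smirnov test
is based on the statistic

$$n^{1/2} \sup_{0 \le t \le 1} |G_n(t) - t|, \tag{2.3}$$

where

$$G_n(t) = \frac{1}{n} \sum_{i=1}^{n} I\left[G\left(\frac{X_i - a}{b}\right) \le t\right], \qquad 0 \le t \le 1, \tag{2.4}$$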


where I(E) denotes the indicator of the event E. Under the null
hypothesis, the "empirical" stochastic process

$$W_n(t) = n^{1/2}[G_n(t) - t], \qquad 0 \le t \le 1, \tag{2.5}$$

satisfies

$$W_n \xrightarrow{d} W^0 \quad \text{in } D[0,1], \tag{2.6}$$

where $\xrightarrow{d}$ denotes convergence in distribution and $W^0$ denotes
the Gaussian process determined by

$$E[W^0(t)] = 0, \qquad 0 \le t \le 1,$$

$$E[W^0(s)W^0(t)] = \min(s,t) - st, \qquad 0 \le s, t \le 1.$$

D[0,1] denotes the space of functions on [0,1] which are right-continuous
and have left-hand limits.

In the following section we present some results on an analogous test
statistic for the case of $H_0$ composite. These results will serve as a
basis for developing the tables of critical values.
3.  The Convergence Theorem and the Test Statistic


When a and b are not specified, that is, when $H_0$ is composite,
we consider an analogous approach based on $(\hat{a}_n, \hat{b}_n)$, the
maximum likelihood estimators of (a,b). We set

$$Y_{n,i} = \frac{X_i - \hat{a}_n}{\hat{b}_n}, \qquad 1 \le i \le n,$$

and analogous to $G_n$ and $W_n$ we define

$$H_n(t) = \frac{1}{n} \sum_{i=1}^{n} I[G(Y_{n,i}) \le t], \qquad 0 \le t \le 1,$$

and

$$V_n(t) = n^{1/2}[H_n(t) - t], \qquad 0 \le t \le 1.$$


Our theorem pertains to the "empirical" stochastic process $V_n(t)$,
and is analogous to the result given by Equation (2.6). However, before
stating the convergence theorem, we introduce the following notation
given in Durbin (1973), and verify that his assumptions (conditions)
are satisfied.

Let us denote by $\theta$ the vector [a,b]', and let $\theta_0$ be any
conveniently chosen value of $\theta$. We state below a verification of the
required conditions.


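Condition A: The distribution $G(x,\theta_0)$ has a density $f(x,\theta_0)$ such
that, for almost all x, the vector $\partial \log f(x,\theta_0)/\partial\theta_0$ exists, and satisfies

$$E\left[\frac{\partial \log f(x,\theta_0)}{\partial\theta_0} \, \frac{\partial \log f(x,\theta_0)'}{\partial\theta_0}\right] = J,$$

where J is finite and positive definite.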
Condition B: Let $\hat{\theta}_n$ be the maximum likelihood estimator of $\theta$; that
is, $\hat{\theta}_n = [\hat{a}_n, \hat{b}_n]'$. Then, it is well known [cf. Cramér (1946)] that

$$n^{1/2}(\hat{\theta}_n - \theta_0) = \frac{1}{n^{1/2}} J^{-1} \sum_{i=1}^{n} \frac{\partial \log f(x_i,\theta_0)}{\partial\theta_0} + \varepsilon_n,$$

where $\varepsilon_n \to 0$ in probability.


Condition C: Let N be the closure of a neighborhood of $\theta_0$. Let
$g(t,\theta) = \partial G(x,\theta)/\partial\theta$ when this is expressed as a function of t by means
of the transformation $t = G(x,\theta)$; let $g(t) = g(t,\theta_0)$. The vector
function $g(t,\theta)$ is continuous in $(\theta,t)$ for all $\theta \in N$ and $0 \le t \le 1$.


Theorem 3.1: By virtue of Conditions A, B, and C, the "empirical" process
$V_n$ determined by the extreme value distribution $G\left(\frac{x-a}{b}\right)$, with
$(\hat{a}_n, \hat{b}_n)$ the maximum likelihood estimators, is such that

$$V_n \xrightarrow{d} V^0 \quad \text{in } D[0,1],$$

where $V^0$ is a Gaussian process determined by

$$E[V^0(t)] = 0, \qquad 0 \le t \le 1,$$

and

$$E[V^0(s)V^0(t)] = \min(s,t) - st - g(s)'J^{-1}g(t), \qquad 0 \le s, t \le 1. \tag{3.1}$$

Proof: Follows from Durbin (1973). //


If we choose $\theta_0 = [0,1]'$, then it can be verified that
$g(t) = [t\log t, \; -t\log t \, \log(-\log t)]'$, and that

$$J^{-1} = \begin{bmatrix} 1.10867 & 0.257 \\ 0.257 & 0.60793 \end{bmatrix}$$

[cf. Johnson and Kotz (1970), p. 282]. Substituting the above into (3.1)
we have the covariance of our Gaussian process

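$$\begin{aligned} E[V^0(s)V^0(t)] = {} & \min(s,t) - st - 1.108\,(s\log s)(t\log t) \\ & + 0.257\,(s\log s)(t\log t \,\log(-\log t)) \\ & + 0.257\,(s\log s \,\log(-\log s))(t\log t) \\ & - 0.60793\,(s\log s \,\log(-\log s))(t\log t \,\log(-\log t)), \qquad 0 \le s, t \le 1. \end{aligned} \tag{3.2}$$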
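The expansion of the covariance can be checked term by term against the matrix form $\min(s,t) - st - g(s)'J^{-1}g(t)$. The sketch below is illustrative only: the function names are hypothetical, and it uses the quoted entries 1.10867, 0.257, and 0.60793 throughout, whereas the printed expansion rounds the leading coefficient to 1.108.

```python
import math

# Entries of J^{-1} as quoted in the text for theta_0 = [0, 1]'.
J_INV = ((1.10867, 0.257), (0.257, 0.60793))

def g(t):
    """g(t) = [t log t, -t log t log(-log t)]' for 0 < t < 1."""
    tlogt = t * math.log(t)
    return (tlogt, -tlogt * math.log(-math.log(t)))

def covariance_matrix_form(s, t):
    """Equation (3.1): min(s,t) - st - g(s)' J^{-1} g(t)."""
    gs, gt = g(s), g(t)
    quad = sum(gs[i] * J_INV[i][j] * gt[j] for i in range(2) for j in range(2))
    return min(s, t) - s * t - quad

def covariance_expanded(s, t):
    """Equation (3.2), written out term by term."""
    p, q = s * math.log(s), t * math.log(t)   # s log s, t log t
    pl = p * math.log(-math.log(s))           # s log s log(-log s)
    ql = q * math.log(-math.log(t))           # t log t log(-log t)
    return (min(s, t) - s * t
            - J_INV[0][0] * p * q
            + J_INV[0][1] * p * ql
            + J_INV[1][0] * pl * q
            - J_INV[1][1] * pl * ql)

# The two forms should agree up to rounding on a small grid of interior points.
for s in (0.1, 0.3, 0.5, 0.9):
    for t in (0.2, 0.5, 0.7):
        assert abs(covariance_matrix_form(s, t) - covariance_expanded(s, t)) < 1e-12
```

The sign pattern of the expansion follows from the minus sign in the second component of g(t): the cross terms pick up one factor of -1 and so enter with a plus sign, while the last term picks up two and keeps its minus sign.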
The statistics of interest in connection with $H_0$ are:

(i) the one-sided Kolmogorov statistics

$$D_n^+ = \sup_{0 \le t \le 1} V_n(t), \tag{3.3}$$

$$D_n^- = -\inf_{0 \le t \le 1} V_n(t), \tag{3.4}$$

(ii) the Kolmogorov-Smirnov statistic

$$D_n = \max(D_n^+, D_n^-), \tag{3.5}$$

(iii) the Kuiper statistic

$$V_n = D_n^+ + D_n^-. \tag{3.6}$$
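For a finite sample, the supremum and infimum in (3.3) and (3.4) reduce to maxima over the ordered probability integral transforms. The following is a minimal sketch under that standard reduction; the function name `edf_statistics` and its arguments are hypothetical, and the estimates `a_hat` and `b_hat` are assumed to have been computed separately.

```python
import math

def edf_statistics(xs, a_hat, b_hat):
    """Return (D+, D-, D, V) of Equations (3.3)-(3.6) for estimates (a_hat, b_hat)."""
    n = len(xs)
    # Probability integral transforms G(Y_{n,i}), sorted in increasing order.
    u = sorted(math.exp(-math.exp(-(x - a_hat) / b_hat)) for x in xs)
    root_n = math.sqrt(n)
    # H_n(t) - t is largest at a jump of H_n and smallest just before one.
    d_plus = root_n * max((i + 1) / n - ui for i, ui in enumerate(u))
    d_minus = root_n * max(ui - i / n for i, ui in enumerate(u))
    return d_plus, d_minus, max(d_plus, d_minus), d_plus + d_minus
```

Note that these values include the factor $n^{1/2}$, matching the definition of $V_n(t)$, so they are on the scale of the limiting process rather than the unscaled scale used in some software.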
Using the fact that each of these statistics is a function $h(V_n(t))$ of
$V_n(t)$ that is continuous with respect to the Skorokhod metric on D[0,1],
the limit laws of $D_n^+$, $D_n^-$, $D_n$, $V_n$, $W_n^2$, $U_n^2$, and $A_n^2$ under $H_0$
are given, respectively, by the laws of the random variables

$$D^+ = \sup_{0 \le t \le 1} V^0(t), \tag{3.10}$$

$$D^- = -\inf_{0 \le t \le 1} V^0(t), \tag{3.11}$$

$$D = \max(D^+, D^-), \tag{3.12}$$

$$V = D^+ + D^-, \tag{3.13}$$

$$W^2 = \int_0^1 (V^0(t))^2 \, dt, \tag{3.14}$$

$$U^2 = \int_0^1 (V^0(t))^2 \, dt - \left[\int_0^1 V^0(t) \, dt\right]^2, \tag{3.15}$$

and

$$A^2 = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{1-\varepsilon} \frac{(V^0(t))^2}{t(1-t)} \, dt. \tag{3.16}$$

The above results follow as a consequence of the continuous mapping
theorem of Billingsley (1968). They provide a basis for Monte Carlo
studies of the asymptotic null distributions of the statistics
discussed above.

4.  Sampling Distributions of the Approximate Test Statistics


Monte Carlo methods were used to simulate the distribution of the
limiting random variables given in Equations (3.10) through (3.16).
Following Serfling and Wood (1976) we approximate the Gaussian process $V^0$
by its finite-dimensional distributions, corresponding to an evaluation
of the process at 29, 99, and 119 equally-spaced points in the unit
interval. One thousand multivariate normal random vectors with the
covariance given by Equation (3.2) were generated using a program from the
International Mathematical and Statistical Library. The empirical
distributions of the supremum, the infimum, and the difference between the
supremum and the infimum of the resulting multivariate normal vectors were
then tabulated, thus approximating the limit laws of $D_n^+$, $D_n^-$, $D_n$, and
$V_n$. Since the differences in the observed quantiles corresponding to
the finite-dimensional distributions of $V^0$ at 29, 99, and 119
equally-spaced points diminished, the approximating procedure was
terminated at 119 equally-spaced points. The asymptotic distributions of
$W^2$, $U^2$, and $A^2$ were obtained by using numerical integration
techniques. For this we used Subroutine QSF from the IBM Scientific
Subroutine Package.

The various sample quantiles for the generated frequency distributions
are shown in Table 4.1.
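The simulation just described can be sketched in a few lines. This is an illustrative reconstruction, not the original program: it replaces the library generator used in the report with a plain Cholesky factorization and Python's `random` module, places the k points at i/(k+1), and uses hypothetical function names and a hypothetical seed. It covers only the supremum-type statistics, not the integral statistics.

```python
import math
import random

J_INV = ((1.10867, 0.257), (0.257, 0.60793))  # J^{-1} as quoted in Section 3

def g(t):
    tlogt = t * math.log(t)
    return (tlogt, -tlogt * math.log(-math.log(t)))

def covariance(s, t):
    """min(s,t) - st - g(s)' J^{-1} g(t), the covariance of the limiting process."""
    gs, gt = g(s), g(t)
    quad = sum(gs[i] * J_INV[i][j] * gt[j] for i in range(2) for j in range(2))
    return min(s, t) - s * t - quad

def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (matrix must be positive definite)."""
    k = len(matrix)
    lower = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(lower[i][p] * lower[j][p] for p in range(j))
            lower[i][j] = math.sqrt(acc) if i == j else acc / lower[j][j]
    return lower

def simulate_limit_statistics(k=29, replicates=1000, seed=0):
    """Draw (D+, D-, D, V) from the process evaluated at k equally spaced points."""
    rng = random.Random(seed)
    grid = [(i + 1) / (k + 1) for i in range(k)]
    lower = cholesky([[covariance(s, t) for t in grid] for s in grid])
    draws = []
    for _ in range(replicates):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # v = L z is multivariate normal with the required covariance.
        v = [sum(lower[i][p] * z[p] for p in range(i + 1)) for i in range(k)]
        d_plus, d_minus = max(v), -min(v)
        draws.append((d_plus, d_minus, max(d_plus, d_minus), d_plus + d_minus))
    return draws

def sample_quantile(values, p):
    """Order-statistic quantile without interpolation."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]
```

A call such as `sample_quantile([d for _, _, d, _ in simulate_limit_statistics()], 0.95)` would give an approximate upper 5% point for D on the 29-point grid. Values obtained this way depend on the seed and grid and should not be expected to reproduce Table 4.1 exactly.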


5.  The Mann-Scheuer-Fertig (MSF) Test

The only other known procedure for testing goodness of fit for the
Weibull that is not based on the empirical distribution function is a test
proposed by Mann, Scheuer, and Fertig (1973).

The MSF test is based on a statistic S, and can be used for censored
as well as uncensored samples. However, the percentage points of
S and certain quantities that are used in calculating S are available
only for sample sizes of up to 25. In contrast, with a modification
given by Stephens (1977), the test statistics we discuss can be used for
any sample size.


For a sample of size n, censored at m, the statistic S is
defined as

$$S = \frac{\displaystyle\sum_{i=[m/2]+1}^{m-1} (X_{i+1} - X_i) / [E(Y_{i+1}) - E(Y_i)]}{\displaystyle\sum_{i=1}^{m-1} (X_{i+1} - X_i) / [E(Y_{i+1}) - E(Y_i)]},$$

where $Y_i = (X_i - a)/b$ and [r] denotes the greatest integer contained in r.
Mann, Scheuer, and Fertig give percentage points of S and the values of
the quantities $E(Y_{i+1}) - E(Y_i)$ for samples of size 3 to 25.


6.  Power  Comparisons 

In order to evaluate the effectiveness of the tests discussed above,
we evaluate their power against the lognormal distribution as an
alternative. The lognormal distribution is chosen because it appears to
be a natural competitor to the Weibull distribution. The power comparisons
were made numerically. For this, random samples of size 20, 25, and 30
were generated from a lognormal distribution whose underlying normal
distribution has mean -0.5 and variance 1.00.


Maximum likelihood estimators of the parameters a and b of
the extreme value distribution were obtained by numerically solving the
following equations simultaneously:

$$\hat{b} = \sum_j X_j / n - \left[\sum_j X_j \exp(-X_j/\hat{b})\right] \left[\sum_j \exp(-X_j/\hat{b})\right]^{-1} \tag{6.1}$$

and

$$\hat{a} = -\hat{b} \log\left[\sum_j \exp(-X_j/\hat{b}) / n\right]. \tag{6.2}$$
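One way to solve these two equations is to find the root of (6.1) in b by bisection and then substitute into (6.2). The report does not say which numerical method was used, so the sketch below is an assumption, with a hypothetical function name and tolerance. It relies on the weighted mean in (6.1) being nondecreasing in b, so the rearranged equation has a single root between 0 and the sample range.

```python
import math
import random

def fit_extreme_value(xs, tol=1e-10):
    """Solve Equations (6.1) and (6.2) for (a, b) by bisection on b."""
    n = len(xs)
    x_min, x_max = min(xs), max(xs)
    if x_max == x_min:
        raise ValueError("sample must contain at least two distinct values")
    mean = sum(xs) / n

    def weights(b):
        # exp(-x_j / b) rescaled by exp(x_min / b) to avoid overflow.
        return [math.exp(-(x - x_min) / b) for x in xs]

    def h(b):
        # Equation (6.1) rearranged as h(b) = 0; h is increasing in b.
        w = weights(b)
        return b - mean + sum(x * wi for x, wi in zip(xs, w)) / sum(w)

    lo, hi = 1e-12 * (x_max - x_min), x_max - x_min  # h(lo) < 0 < h(hi)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    b_hat = 0.5 * (lo + hi)
    # Equation (6.2), using the same rescaling of the exponentials.
    a_hat = x_min - b_hat * math.log(sum(weights(b_hat)) / n)
    return a_hat, b_hat

# Usage on one sample from the lognormal alternative described above
# (normal mean -0.5, variance 1.00), after taking negative logarithms.
rng = random.Random(0)
lognormal_sample = [rng.lognormvariate(-0.5, 1.0) for _ in range(20)]
a_hat, b_hat = fit_extreme_value([-math.log(t) for t in lognormal_sample])
```

The rescaling by the sample minimum changes neither equation, since the common factor cancels in the ratio of (6.1) and is restored explicitly in (6.2).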


The results of our power comparisons are shown in Tables 6.1,
6.2, and 6.3, and these are based on 1000 replicates. Based on this
limited experiment, it appears that for samples of sizes 20 and 25, the
MSF test has better power. For samples of size 30, the MSF test could
not be used, and the Anderson-Darling test appears to have better power.

7.  Concluding Remarks

After finishing the work on this report we were informed that
Stephens (1977) has also obtained asymptotic percentage points for the
statistics $W^2$, $U^2$, and $A^2$. Stephens also gives a necessary
modification so as to use these statistics for finite sample sizes. Even
though our approach is different, it is encouraging to note that our
results seem to be in good agreement with those of Stephens. A comparison
of the asymptotic points we obtained with those of Stephens is given in
Table 7.1. Stephens has made no power comparisons, and since our
results agree quite well with his, we conclude that our power comparisons
remain valid.